Method and apparatus for encoding image and/or audio data

ABSTRACT

There is disclosed method and apparatus for structured encoding of a previously encoded source ( 100, 105, 140 ) of data, where the structure ( 200, 210, 220, 230 ) is not defined in the received data. The invention finds particular application in block-based compression of digitised image or audio data derived from analogue sources, for example using MPEG encoding. The encoding introduces discontinuities in pixel colour and/or brightness across the block boundaries ( 200, 210, 220, 230 ), the introduction of which can lead to a marked deterioration in quality, and inefficient use of bandwidth. Encoding data using the same block and pixel structure used previously renders the discontinuities effectively invisible, substantially eliminating these problems. To do so, the received data is processed ( 300 ) to detect artefacts contained within the previously encoded and decoded data, information as to the structure ( 200, 210, 220, 230 ) imposed on the data by the previous encoding process ( 100, 105, 140 ) is extracted by analysis of the artefacts, and the received data is encoded by reference to the extracted structure information.

The invention relates to method and apparatus for encoding of data received from a source, wherein the encoding is of a type which imposes a structure on the data, which structure is not defined in the data as received. The invention finds particular application in block-based compression of digitised image or audio data derived from analogue sources, for example using MPEG encoding.

As is well known, images, and particularly motion picture sequences for television and video recording applications, can be transmitted and stored in either analogue or digital formats. Digital transmission and storage is becoming increasingly practicable, both for professional and consumer applications. It is commonly necessary to digitise and encode images from analogue sources for transmission or storage, and vice versa. These may be still images, such as those generated in digital photography or scanned from a film or paper, or a stream of images forming a motion picture sequence. Digital video from a camera or recording may be converted to analogue form for broadcast and then converted to digital form again for storage, such as on a domestic digital video recorder (DVR) apparatus.

Digital transmission and storage systems generally use block-based compression, such as JPEG or MPEG-2, to achieve acceptable image quality within the available transmission bandwidth and storage capacity. JPEG is a still image compression system based upon performing Discrete Cosine Transformation (DCT) on groups, or blocks, of pixel data. MPEG-2 is a motion video compression system based upon the same principles. To achieve substantial data compression, the DCT coefficients representing each block of pixels are subjected to adaptive quantisation and Variable Length Encoding (VLE). Blocks are also grouped together in fours, to form “Macroblocks”, so that chrominance (colour) components can be represented with half the spatial resolution provided for the luminance (brightness) component. These techniques can be applied in both still images (JPEG) and motion video (MPEG). For moving pictures, motion-compensated inter-frame predictive encoding is performed on a macroblock basis, to achieve further compression.

Due to the quantisation, these compression systems are “lossy” systems, whereby encoded data, after decoding, is not identical to the original data before encoding. This may manifest itself as differences in pixel luminance and/or chrominance, all generally appearing as noise in the reconstructed image. A particularly noticeable form of noise in block-based compression systems such as JPEG and MPEG is the appearance of discontinuities in pixel colour and/or brightness across the block boundaries. These artefacts will be referred to herein as “block noise”. The human eye is very sensitive to abrupt changes in contrast such as these, which appear as a grid-like pattern superimposed upon a normal, moving image. EP 0998146 A for example describes apparatus for detecting block noise and smoothing the discontinuities at the block boundaries, to minimise the obtrusiveness of the block boundaries in the viewed image.

Compression encoders generally implement a continuous trade-off between image quality and transmission bandwidth or file size. The picture quality available depends heavily on the content and also the quality of the source image. Noise in the source image leads to a marked deterioration in quality, as the random features are inherently more costly to represent than the more coherent signals for which the system is designed. On the other hand, repeatedly decoding and then re-coding images that have been encoded by these methods does not necessarily result in greater degradation, because the remaining information is already adapted to what the re-encoding process can reproduce within the available bandwidth. Although the image being re-encoded may contain noticeable block noise, for example, because each block is treated separately by the DCT process, these artefacts may be reproduced in the re-encoded image, but they will not be compounded, nor consume any additional bandwidth, as they are effectively “invisible” to the re-encoder.

The inventor has recognised a problem, however, where decoded images containing block noise are transmitted or stored in analogue form, and are then supplied to the encoder for digital transmission or storage. In this case, there will generally be no alignment between the block noise artefacts present in the source image and the block boundaries applied by the encoder. Accordingly, the encoder will “see” the block noise as part of the signal to be encoded. Then, not only will the block noise be reproduced in the encoded image, the bandwidth required to represent these sharp discontinuities within the encoder's pixel blocks will reduce the bandwidth available to represent the true image content, leading to a marked degradation in image quality. On decoding the image, two sets of block noise will be included, and any further transmission by an analogue channel and re-encoding will compound the problem further.

When handling motion video according to a block-based encoding method such as MPEG-2, a sequence of frames is encoded as a notional Group Of Pictures (GOP) employing differing coding schemes. The schemes typically comprise intra-coding “I” frames, each of which is coded using only information from the frame itself (similar to JPEG); predictive coding “P” frames, which are coded using motion vectors based on a preceding I-frame; and bi-directional predictive coding “B” frames, which are encoded by prediction from I and/or P frames before and after them in sequence. The choice of coding schemes and the order in which they are sequenced depends upon the integrity of the communication medium being used to convey the motion video. For example, if there is a high risk of corruption, it may be decided to repeat a greater number of “I” frames in a GOP than would be used for a more secure link, so that upon interruption an image can quickly be reconstructed.

Ideally, to achieve greatest compression and minimise degradation through decoding and re-coding steps, the same GOP sequence would be used by all encoding stages. EP 0106779 A seeks to send “history” data with digital video signals, so that re-encoding can be performed with regard to the GOP structure of a predecessor data stream. Again, however, if the pictures have been through the analogue domain in the meantime, such history data is not available. When this happens, frames that were originally I-frames may subsequently be encoded as B- or P-frames, and frames that were originally B- or P-frames may subsequently be encoded as an I-frame. This will generally result in a loss of picture quality, which would be compounded if the decoding and re-coding process were repeated.

Similar issues arise in the encoding of audio data from an analogue source, which may have been compressed previously. For example, many audio compression systems divide the audio sample stream into short blocks similar to blocks of pixels but in one dimension only, and encode each block in terms of its spectral content. In this case, the blocks represent temporal structure rather than spatial structure, but the presence of block boundary artefacts, and the problems of bandwidth stealing give rise to analogous problems to those described above.

Accordingly, it is an object of the invention to provide improved methods and apparatus for performing block-based encoding of data such as images and sounds derived from analogue sources, particularly methods that can preserve the quality of images/sounds that have been previously block-based encoded and contain block noise or other structured artefacts.

According to a first aspect of the present invention, there is provided a method of encoding of data received from a source, wherein the encoding is of a type which imposes a structure on the data, which structure is not defined in the data as received, the method comprising the steps of:

-   analysing the received data to detect artefacts contained within the data indicating that the data has been through a previous encoding and decoding process of the same type;
-   extracting by analysis of said artefacts information as to the structure imposed on the data by said previous encoding process;
-   encoding the received data by reference to the extracted structure information.

The encoding step may be performed so as to maximise alignment between the structure imposed by the encoding process and that imposed by the previous encoding process.

As will be seen from the following examples, using the same structure as was used before allows images or audio data to propagate through a system involving multiple encoding/decoding stages with reduced degradation of quality. A particular advantage is avoiding consumption of bandwidth by the unnecessary encoding of artefacts from the previous encoding process.

Where the received data represents an image, such as an image received through an analogue transmission or storage process, the structure imposed by the encoding process may include a spatial structure in which pixels of the image are processed in blocks, the encoding being performed so as to align block boundaries of the encoding process substantially with block boundary artefacts present in the received image data as a consequence of the previous encoding process.

The encoding process may be of a type which imposes a spatial structure in which the blocks of pixels are grouped into macroblocks. In such a case, the encoding may be performed so as to align macroblock boundaries of the encoding process substantially with macroblock boundary artefacts present in the received image data as a consequence of the previous encoding process. In JPEG- or MPEG-derived image data, macroblock boundary artefacts can be detected only in the chrominance components of the image data, as opposed to the luminance data. The term “block” should be interpreted as including “macroblock”, except where the context requires otherwise.

In cases where the relative resolution between chrominance and luminance components of the image is not fixed in advance, the detection of block boundary artefacts separately in chrominance and luminance components will also allow determining the relative resolution as a preliminary step. This can then be used to set up the encoder with the same parameters, alternatively or (preferably) in addition to aligning the block boundaries in the manner described above.

The received image data may (additionally) be a motion picture sequence of images. In this case, the structure information used for each successive image may be derived entirely by analysis of the present image, entirely from a previous image, or from a combination of previous and present images. These embodiments can be selected according to the circumstances. The first option allows for jitter in the structure from frame to frame, but may have difficulty in identifying the structure where the content of the image data is such that it lacks strong artefacts in a given frame (such as a blank image between scenes). The second option can avoid this problem, while still allowing the encoder to adapt to a slower drift in the structure of the artefacts relative to the received image data.

The step of analysing the received data may include storing all or at least a substantial part of an image and performing spectral analysis to identify periodic components indicating the presence of block boundary artefacts. The step of extracting structure information may comprise analysing said image to determine the spacing (frequency) and location (phase) of those artefacts. If the image data is stored for analysis in an image store, the spectral analysis may comprise applying a Fast Fourier Transform (FFT) to the stored data.
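By way of illustration only, the determination of spacing and phase might be sketched as follows. This sketch substitutes a simple period-folding search for the FFT mentioned above, and works on a horizontal edge profile of a stored image; the function names, the search range of 4 to 32 samples and the scoring rule are assumptions made for the example and are not taken from the specification.

```python
def column_edge_profile(image):
    """Sum, over all rows, of the absolute step between columns x-1 and x."""
    width = len(image[0])
    profile = [0.0] * width
    for row in image:
        for x in range(1, width):
            profile[x] += abs(row[x] - row[x - 1])
    return profile


def detect_block_grid(profile, min_period=4, max_period=32):
    """Return (period, phase) of the grid that best concentrates edge energy.

    For each candidate period the profile is folded modulo that period.
    The score is the share of edge energy landing on one phase, minus the
    share of positions that phase occupies, so that short periods are not
    favoured merely because they cover more columns.
    """
    total = sum(profile)
    if total == 0:
        return None  # no edges at all, e.g. a blank frame
    n = len(profile)
    best = None
    for period in range(min_period, max_period + 1):
        sums = [0.0] * period
        counts = [0] * period
        for x, value in enumerate(profile):
            sums[x % period] += value
            counts[x % period] += 1
        for phase in range(period):
            score = sums[phase] / total - counts[phase] / n
            if best is None or score > best[0]:
                best = (score, period, phase)
    return best[1], best[2]


# Synthetic test image: flat 8-pixel-wide blocks whose grid is offset by 3.
image = [[((x - 3) // 8 * 37) % 100 for x in range(64)] for _ in range(4)]
grid = detect_block_grid(column_edge_profile(image))
```

The same search applied to a row-wise profile would give the vertical spacing and phase. A practical detector would also have to separate block edges from genuine image edges, which this sketch does not attempt.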

The encoding step may be performed by separate steps of pre-processing the data to produce data having a standardised structure. This allows a generic encoding process (software and/or hardware) to be applied without modification. For example, in an MPEG encoding process the encoder generally applies a block/macroblock structure of 8×8/16×16 pixels, starting at the top left pixel of the image. Said pre-processing step may be performed by re-sampling the image data entirely in the digital domain. Filtering may be applied to interpolate pixel values for this purpose. The received image data may be over-sampled when initially digitised from the analogue, to minimise loss of quality in this re-sampling step.

The re-sampling may be performed on an entire image before encoding begins, or it may be performed during read-out of pixel data for encoding.
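A minimal sketch of such a re-sampling step, for a single row of over-sampled data, is given below. It assumes that the block spacing (`period`) and the position of the first boundary (`phase`) have already been measured in units of input samples, and it uses linear interpolation as the simplest possible interpolating filter; the function name and parameters are hypothetical.

```python
def resample_row(row, period, phase, block=8):
    """Re-sample one row so that detected block edges fall every `block` pixels.

    Output pixel k is read from input position phase + k * period / block,
    so output pixel 0 sits on the first detected boundary and every
    `block`-th output pixel sits on a later one.
    """
    step = period / block
    out_len = int((len(row) - 1 - phase) / step) + 1
    out = []
    for k in range(out_len):
        pos = phase + k * step
        i = int(pos)
        frac = pos - i
        if i + 1 < len(row):
            # Linear interpolation between the two neighbouring samples.
            out.append(row[i] * (1 - frac) + row[i + 1] * frac)
        else:
            out.append(float(row[i]))
    return out


# A ramp sampled at twice the pixel rate, with block edges every 16 samples
# starting at sample 3.
ramp = [float(x) for x in range(40)]
aligned = resample_row(ramp, period=16, phase=3)
```

Applying the same function row by row, and then column by column with the vertical measurements, would yield an image whose block structure starts at the top left pixel as a generic encoder expects.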

Where the received image data represents a motion picture sequence, the structure imposed by the encoding process may be a temporal structure (GOP structure) in which different images of the sequence are processed differently, the encoding being performed so as to apply substantially the same GOP structure to the sequence as was applied in the previous encoding process. Alternatively, the encoding may be performed so as to apply a different GOP structure to, but temporally associated with, that used in the previous encoding process. In particular, the analysis of artefacts may distinguish between intra- and inter-coded pictures.

The analysis of GOP structure may be performed by analysing several images stored in full in a memory, or it may be performed by preserving only parameters of past images and analysing the present image with respect to those parameters. It may be that the GOP structure is only recognised after analysing several frames of the sequence. Intra-coded pictures will typically arise on a fairly regular basis and contain more high-frequency components, and can be identified in this way. Note that the DCT apparatus of the encoding process could be used to measure the high frequency components. On the other hand, it may be simpler to provide separate filters for this purpose, to retain the generic encoder and to reduce design effort and uncertainty. The designer can choose whether to delay encoding until the GOP structure has been determined, or to encode initially without reference to the GOP structure. If desired, alignment of the structures could begin when sufficient information becomes available. Clearly the latter option will be preferred, especially when feeding TV transmissions for simultaneous display, where video segments with and without coding artefacts may be freely edited together.
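The parameter-preserving variant might be sketched as follows, assuming that one high-frequency energy figure per frame is already available (from the DCT apparatus or from separate filters). The function name, the threshold of 1.5 times the mean and the use of the most common spacing are illustrative assumptions only.

```python
from collections import Counter


def estimate_gop(hf_energy, threshold_ratio=1.5):
    """Guess (GOP length, index of first intra frame modulo that length).

    Frames whose high-frequency energy stands well above the mean are
    treated as candidate intra-coded pictures, and the most common spacing
    between candidates is taken as the GOP length. Returns None until at
    least two candidates have been seen.
    """
    mean = sum(hf_energy) / len(hf_energy)
    candidates = [i for i, e in enumerate(hf_energy) if e > threshold_ratio * mean]
    gaps = [b - a for a, b in zip(candidates, candidates[1:])]
    if not gaps:
        return None
    length, _ = Counter(gaps).most_common(1)[0]
    return length, candidates[0] % length


# 36 frames with a stronger high-frequency figure every 12th frame from frame 2.
energies = [10.0 if i % 12 == 2 else 3.0 for i in range(36)]
gop = estimate_gop(energies)
```

Returning `None` for short inputs reflects the observation above that the GOP structure may only be recognised after several frames, so an encoder can start without it and align later.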

The received data may alternatively comprise audio data. The structure imposed by the encoding process may include a temporal structure in which samples of an audio signal are processed in blocks, each representing a short time interval, the encoding being performed so as to maximise alignment of block boundaries of the encoding process substantially with block boundary artefacts present in the received audio data as a consequence of the previous encoding process. The principles applied in the embodiments of image processing described above and below can be adapted generally to the audio encoding process. One difference is that audio data is one-dimensional and continuous, rather than two-dimensional data organised in separate image frames that can be processed, if desired, in isolation from one another. The methods adopted for an audio stream will therefore be of the continuous variety in which the existence and position of artefacts will be detected on an on-going basis and the encoding step will be adapted on an on-going basis to maximise alignment of the block boundaries over time, rather than in every part of the data stream.

In the case of audio data, therefore, the analysis step may include a phase-locked loop (PLL) process which is attuned to detect and then lock on to block boundary artefacts in a continuous data stream. The encoding step may then include a second phase-locked loop or similar process for maximising alignment of the block boundaries of the encoding process with the detected block boundary artefacts gradually over time, to avoid sudden discontinuities in the block structure imposed by the encoding step.
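The two loops might be sketched in software as shown below, assuming that some earlier stage already reports the sample index of each detected boundary artefact. The class and function names, the loop gains and the nominal block length of 1024 samples are assumptions made for the example, not values given in the specification.

```python
class BoundaryTracker:
    """Software tracking loop that locks on to block boundary positions."""

    def __init__(self, nominal_period, phase_gain=0.2, period_gain=0.02):
        self.period = float(nominal_period)  # current block length estimate
        self.predicted = 0.0                 # predicted boundary position
        self.phase_gain = phase_gain
        self.period_gain = period_gain

    def update(self, detected):
        """Feed one detected boundary (sample index); return the phase error."""
        # Advance the prediction to the grid point nearest the detection.
        while self.predicted + self.period / 2 < detected:
            self.predicted += self.period
        error = detected - self.predicted
        # Proportional correction of phase, slower correction of period.
        self.predicted += self.phase_gain * error
        self.period += self.period_gain * error
        return error


def slew_towards(current, target, max_step):
    """Move the encoder's own boundary towards the target by a bounded step."""
    delta = max(-max_step, min(max_step, target - current))
    return current + delta


# Boundaries every 1024 samples, starting 300 samples into the stream.
tracker = BoundaryTracker(nominal_period=1024)
errors = [tracker.update(300 + k * 1024) for k in range(100)]
```

The bounded step in `slew_towards` plays the part of the second loop: the encoder's block structure drifts towards the detected one instead of jumping, which avoids a sudden discontinuity in the imposed structure.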

The invention further provides an apparatus for encoding data, the apparatus being adapted to implement the method according to the invention as set forth above.

The apparatus may comprise a digital video recorder or digital audio recorder, as appropriate.

As mentioned above, the invention may be implemented using pre-processing and a generic encoding process or processing apparatus.

Accordingly, the invention yet further provides a method of pre-processing data received from a source, for subsequent application to an encoding process which imposes a structure on the data, which structure is not defined in the data as received, the method comprising the steps of:

-   analysing the received data to detect artefacts contained within the data indicating that the data has been through a previous encoding process of the same type;
-   extracting by analysis of said artefacts information as to the structure imposed on the data by said previous encoding process;
-   processing the received data by reference to the extracted structure information so as to maximise alignment between the structure imposed by the previous encoding process and a predetermined structure.

A consumer having generic encoding equipment or software can then in principle add on the pre-processing capability. The pre-processing could also be performed by a broadcaster prior to transmitting the data as a digital TV or audio broadcast signal, such that subscribers having generic encoding equipment can benefit from the invention without investment on their part.

The particular embodiments described above can be applied in this form of method. A pre-processing apparatus is similarly provided.

The invention yet further provides a computer program product comprising instructions for causing a programmable computer to implement the specific method steps and/or apparatus features of the invention in any of its aspects as set forth herein. The computer program product may be supplied independently of any computer hardware, and may be supplied either in the form of a record carrier or in electronic form over a network.

Embodiments of the invention will now be described, by way of example only, by reference to the accompanying drawings, in which:

FIG. 1 depicts an original image having smooth edges, prior to block-based encoding;

FIG. 2 depicts the image of FIG. 1 after lossy block-based encoding;

FIG. 3 shows block noise prevalent in the real image that was depicted in FIG. 2;

FIG. 4 illustrates a typical system having a number of encoding and subsequent decoding stages for transmitting analogue motion video from source to user across communication links having restricted bandwidth;

FIG. 5 illustrates the effect on block boundaries of an image having passed through the various stages (A, B, C) of the system of FIG. 4;

FIG. 6 illustrates an improved encoder of the present invention for detecting encoding parameters, for subsequent use in block-based encoding;

FIG. 7 is a block diagram of the Boundary Edge Detector of the encoder of FIG. 6;

FIG. 8 shows some detectable boundaries that might exist in a typical block-based encoded image;

FIG. 9 shows the detectable boundaries of FIG. 8 that the Boundary Edge Detector of FIG. 7 has interpolated between to form an encoding grid; and

FIG. 10 shows derivation of pixel clock from detected and interpolated block boundaries.

It has been, and will remain, a goal of designers of image processing systems to minimise the quantity of noise introduced into a signal as it progresses through the system.

Various techniques exist for the suppression of noise within a video image, before display. For example, a low-pass filter will reduce the abruptness of any high-frequency (and therefore noticeable) transitions, making the image more visually acceptable. However, doing so will also reduce the bandwidth of the entire image, resulting in a less sharp and therefore degraded image.

Alternatively, it is preferred to minimise the generation of noise itself, rather than to try to suppress it once it has entered the system. Various screening techniques currently exist to minimise a system picking up noise, but it is more of a challenge to minimise the generation of noise by the system itself. Image compression using block-based encoding actually self-generates an amount of noise, which can propagate and in certain circumstances be accentuated as the signal progresses through the system.

FIG. 1 depicts a derived image prior to block-based encoding. The lines depict regions of high contrast change. Lines and curves are smooth. (The original image from which this was derived also exhibited a wide dynamic tonal range).

FIG. 2 depicts the image of FIG. 1 after it has been compressed to a reduced file size, using block-based encoding such as JPEG. As before, the lines depict points of high contrast. The skilled reader will appreciate that if the image was one selected from a motion video sequence then the compression used may have been MPEG encoding. Because the encoding scheme is “lossy”, a number of artefacts have been introduced into the image. For example, sharp objects now protrude into the lines. The smooth lines have been replaced by jagged edges.

The wide tonal range of the original image would be replaced by small square blocks of uniform tone (not shown). As a result, a smooth transition of tone across a selected area is now replaced by steps of different uniform tonal values. Some of the steps between blocks are of sufficiently large difference to be noticeable within the image.

FIG. 3 is the image depicted by FIG. 2, after being processed by an edge detector. This image was derived by detecting points of high contrast between adjacent pixels. If the process was performed on the original image as depicted by FIG. 1 then the result would be fairly similar to FIG. 1 as shown. However, when performed on the image that has been block-based encoded, as depicted by FIG. 2, in addition to the base image one can observe clearly defined blocks of equal size and shape. The blocks relate to pixel groups of 8 by 8 pixels, and the effect is known as “Block Noise”, because it occurs at detectable transitions between blocks.

A block-based compression scheme reduces the size of an image file (and/or the bandwidth required to transmit the image across a limited-bandwidth carrier) by separately encoding regions within the image. Each block is processed to eliminate components of the signal that are not essential for conveying the image (generally high frequencies). A motion sequence is further compressed by only transmitting image data that has changed relative to the previous frame. Cumulative errors are reduced by sending a fresh, reference frame at regular intervals. The means by which motion video is processed are described later.

The blocks within each image are visible because reconstruction for display of each pixel within each block is now only an approximation of its original value. This is because some of the data used to reconstruct the block has been discarded by the encoding process. The greater the compression selected, the greater the resultant approximation of each pixel value within the block. Adjacent blocks will become visible because the smooth gradation between pixels in the original image has been replaced by steps between pixel values. Varying deviation of pixel value about its original value contributes to making the steps more visible.

FIG. 4 illustrates a typical video production, processing and distribution system. A multimedia source 100 is filmed 105, and passed to studio 110 for processing. The video is subsequently transmitted 120 and received 130 within a domestic environment, for decoding 140 and display 150. Optionally, the video can be recorded 160 for later viewing. The system includes a number of block-based encoding and subsequent decoding stages (A, B, C) for transmitting motion video within the system across communication links having restricted bandwidth.

In the example shown, the multimedia source 100 is filmed by an outside broadcast unit and the resultant analogue video recorded onto videotape. The video recorder uses MPEG encoding to compress the video, to provide sufficient recording time using a small cassette. This is the first stage (A) of block-based encoding in the example system. The videotape 105 is then transferred to the studio 110, where it is decoded back into analogue video. At this point a number of artefacts are introduced into the analogue video, as a result of the inefficiencies of the prior encoding and subsequent decoding process.

Once the video has been processed by the studio, for example by mixing with other multimedia content, the signal is transmitted 120 to the consumer 130. The transmission involves a further stage (B) of block-based encoding, such as MPEG-2, as the bandwidth of each transmission channel may be restricted. The consumer receives the signal, which is then decoded 140 to provide analogue video VID for display by a monitor 150. The consumer may wish to record the video being displayed on the monitor, and has a cassette-less recording device 160, such as one using a hard drive to store digitised video. Video VID is compressed once again (C) using block-based encoding, to maximise the capacity of the hard drive. When subsequently displayed, the video is played back and decoded in similar fashion to the previous two stages.

The video information passing through this system has to pass through three stages (A, B, C) of block-based encoding and subsequent decoding, where the signal is conveyed between stages in analogue form. As a result of using analogue video, no information is passed between stages that would allow at each encoding stage the pixels of the same image to be encoded according to the same rules, and therefore in exactly the same manner as for previous encoding stages.

FIG. 5 illustrates the effect on block boundaries of an image having passed through the various stages (A, B, C) of the system of FIG. 4. The unbroken lines 200 denote the block boundaries used by the first encoding/decoding stage. The dashed lines 210, 220 and 230 denote the block boundaries used by the subsequent encoding/decoding stages. One can observe that the block boundaries are located differently within the image frame. This is because the locations of the block boundaries are dictated by various factors, such as clock speed, image size and image offset. Variances in timebase, such as those caused by video tape recorder tape transport mechanisms or by environmental factors such as temperature, may cause the boundaries to move relative to each other over a period of time, when the analogue signals are digitised.

The consequence of these varying boundaries is a reduction in quality of the images within the image sequence. This is because block boundary artefacts introduced in previous stages of block-based encoding/decoding 200 are then treated as meaningful image content data in any successive encoding stages.

In seeking to solve the problem, the inventor has observed that encoding an analogue image using the same block and pixel structure as was used in a previous encoding stage renders the block boundary artefacts effectively invisible to the encoder, which treats each block of pixels substantially as an independent unit. This significantly improves the quality of the images without impact upon bandwidth requirements, because artefacts introduced at the first stage of encoding will not consume bandwidth by being treated as image content by further encoding stages.
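The observation can be illustrated with a deliberately crude one-dimensional model. In the sketch below each block is simply replaced by its mean value, which stands in for DCT and quantisation only in the sense that every block is treated as an independent unit; the function `toy_block_codec` and its parameters are hypothetical and form no part of the described encoder.

```python
def toy_block_codec(samples, block=8, offset=0):
    """Lossy toy codec: replace every block of samples by its mean.

    Blocks start at `offset`, `offset + block`, and so on; any samples
    before the first boundary form a shorter leading block.
    """
    n = len(samples)
    edges = sorted({0, n, *range(offset % block, n, block)})
    out = []
    for a, b in zip(edges, edges[1:]):
        mean = sum(samples[a:b]) / (b - a)
        out.extend([mean] * (b - a))
    return out


ramp = [float(x) for x in range(32)]

# First generation: introduces steps ("block noise") at 0, 8, 16, 24.
first = toy_block_codec(ramp, offset=0)

# Second generation on the same grid reproduces the first exactly.
aligned = toy_block_codec(first, offset=0)

# Second generation on a grid shifted by 4 alters the signal again.
misaligned = toy_block_codec(first, offset=4)
extra_error = sum(abs(a - b) for a, b in zip(misaligned, first))
```

With the grids aligned, the second pass has nothing new to discard, so the old block steps cost nothing. With the grids shifted, each new block straddles an old step and the signal is altered a second time, which corresponds to the compounding of block noise described above.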

The inventor has further found that it is possible to analyse an analogue image to determine whether or not it has been previously encoded using a block-based image compression system and use results of the analysis to direct the encoding process.

FIG. 6 illustrates an improved encoder, performing the two principal functions of a) analysing the input analogue video IV to detect the encoding parameters used in a previous encoding stage, such as block and pixel boundaries and pixel clock, and b) using the detected encoding parameters to direct the block-based encoding of the input video.

A Boundary Edge Detector BED 300 is used for analysing input analogue video to determine the encoding parameters such as horizontal “H” and vertical “V” block boundaries within each image, and from these boundaries deriving a pixel clock “CLK” that directly corresponds to the locations of pixels within each block. Attempts have been previously made to analyse analogue video to suppress block noise, an example of which is illustrated in EP 0998146A. The detectable horizontal and vertical block boundaries within a previously block-encoded video frame are used to suppress the block noise, but only adjacent to these detected boundaries.

The Boundary Edge Detector BED 300 includes a digitisation and storage front end DIG/BUF 304, which is accessed both for analysis to determine the boundary edges, and as a source of digital video data for the block-based encoder.

In an embodiment where the controller also detects GOP structure from artefacts in the received image data, then the controller may also direct the encoder to impose a corresponding GOP structure on the new encoding. The GOP structure would be conveyed via an interface between the BED and the encoder's controller (not shown). Alternatively, however, the information as to GOP structure may be used to influence the encoder as to GOP structure or quantisation strength, but not to dictate rigidly a GOP structure for the encoding process. MPEG encoding processes tend to require freedom to select the GOP structure, for example, to control bandwidth.

The processing stages of the encoder comprise conventional stages of a block-based encoder; these being Discrete Cosine Transform (DCT) 320, Quantisation (Q) 330, Run-Length Variable Length Encoder (RL-VLC) 340, Bitstream Buffer (BB) 350, Inverse Quantisation (IQ) 360, Inverse Discrete Cosine Transform (IDCT) 370, Motion Compensator (MC) 380, Motion Estimation (ME) 390, and frame memory buffer (BUF) 400. The output stream OS is taken from the Bitstream Buffer BB 350, and corresponds to a stream of block-based encoded video data.

FIG. 7 is a block diagram of a digital Boundary Edge Detector BED 300, where the images are digitised DIG 600, double-buffered by memories BUF 610, 620, and processed by processor PROC 630 to derive block boundaries H, V and a pixel clock CLK. The processor could be a DSP or FPGA solution.

The skilled person will appreciate that various techniques can be used to analyse the image data to obtain the block boundary artefacts, including for example techniques explained in detail in EP 0998146A, mentioned in the introduction. In the improved encoder of the first embodiment, the detected boundaries H and V and pixel clock CLK are specifically used to standardise the structure of the image to one compatible with the encoder. The encoder does not perform suppression of block noise adjacent to the boundaries. Instead, by employing an image store and boundary edge detector, it ensures that the encoding is performed using the same boundaries as were used before. Doing so ensures that each block is encoded using the same boundaries as the image progresses through different encoding stages, eliminating the encoding of block boundaries as image data. The skilled person will, however, appreciate that this does not exclude introducing additional means for suppressing block noise in a further embodiment.

The encoding stage is a conventional block-based encoder, such as one for performing MPEG encoding of motion video. The encoder will be selectable to operate according to different display standards, such as VGA or SVGA, although a further embodiment may include auto detection of the video standard from a wide range of input video standards by analysis of the timing influenced by the timing signals derived by the detection of block boundaries and derivation of pixel clock.

Each frame of input video will contain a number of detectable boundaries that Boundary Edge Detector BED 300 will be able to detect and use to derive all boundary edges.

FIG. 8 illustrates detectable boundaries within a single image frame. One can observe that gaps are present that thwart detection of a full grid. In the disclosure of European Patent EP 0998146A described above, it does not matter if the boundaries cannot be detected in these regions, because there is no block noise within the gaps that needs to be suppressed and therefore there is no need to derive a full grid. However, a full grid is required in the embodiments of the improved encoder because precise timing is required for all blocks and pixels within each video frame.

FIG. 9 shows the image of FIG. 8, where the Boundary Edge Detector of FIG. 7 has interpolated between the detectable boundaries (depicted by the dashed lines) to form an encoding grid.
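By way of illustration only, the interpolation between detectable boundaries might be sketched as a least-squares fit of a regular grid to the sparse detections. The function name `fit_grid`, the `nominal_pitch` parameter and the choice of a straight-line fit are assumptions made for this sketch and are not taken from the described embodiment.

```python
import math

def fit_grid(detected, nominal_pitch, extent):
    """Fit a regular grid to sparsely detected boundary positions.

    detected      -- sorted positions (in digitiser samples) of detected boundaries
    nominal_pitch -- rough expected spacing between boundaries (assumed known)
    extent        -- width or height of the frame in digitiser samples
    Returns (pitch, grid), where grid lists every boundary position in [0, extent].
    """
    if len(detected) < 2:
        raise ValueError("need at least two detected boundaries")
    origin = detected[0]
    # Give each detection an integer grid index relative to the first one,
    # so that gaps (undetected boundaries) simply skip indices.
    idx = [round((p - origin) / nominal_pitch) for p in detected]
    n = len(detected)
    mean_i = sum(idx) / n
    mean_p = sum(detected) / n
    var_i = sum((i - mean_i) ** 2 for i in idx)
    if var_i == 0:
        raise ValueError("detections collapse onto a single grid line")
    # Straight-line least-squares fit: position = offset + pitch * index.
    pitch = sum((i - mean_i) * (p - mean_p) for i, p in zip(idx, detected)) / var_i
    offset = mean_p - pitch * mean_i
    # Extend the fitted grid in both directions to cover the whole frame.
    k_min = math.ceil((0 - offset) / pitch)
    k_max = math.floor((extent - offset) / pitch)
    return pitch, [offset + k * pitch for k in range(k_min, k_max + 1)]
```

For example, detections at 16, 24, 56 and 64 with a nominal pitch of 8 leave a gap of three undetected boundaries, which the fitted grid fills in.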

The digital BED 300 illustrated in FIG. 7 digitises the analogue image at a suitable rate and stores it in a frame store. In accordance with Nyquist theory, the digitisation rate may be in the order of two times the image bandwidth, or higher, depending upon the accuracy required by the BED to correctly determine the true location of block boundaries within the image. The image is then processed (either as it is being loaded into memory, or once a complete frame has been stored) to derive the block structure. Methods for achieving this are well known, and include weighted filter kernels (small arrays of coefficients) that are passed over the image. Double buffering may be applied as appropriate, to maintain continuity. In that case, as one buffer is being processed to derive the block and pixel structure, another is being loaded with the next frame. The buffers switch at frame or field rate, depending upon the video standard being processed. The pixel clock is provided by a frequency synthesiser, controlled by the processor and derived from the measured block structure.
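As a minimal sketch of a weighted filter kernel passed over the image, the following assumes a four-tap horizontal kernel and a known block width of 8 pixels. The kernel coefficients, the names `column_response` and `block_phase`, and the folding of the response modulo the block width are illustrative assumptions, not the specific method of the embodiment.

```python
# Hypothetical four-tap kernel: it returns the step height at an isolated
# step edge and zero on a uniform ramp, so smooth gradients are ignored.
KERNEL = (0.5, -1.5, 1.5, -0.5)

def column_response(image):
    """Accumulate the absolute kernel response per column over all rows.

    image -- list of rows, each a list of luminance values.
    resp[x] measures the discontinuity between columns x-1 and x.
    """
    width = len(image[0])
    resp = [0.0] * width
    for row in image:
        for x in range(2, width - 1):
            acc = sum(k * row[x - 2 + j] for j, k in enumerate(KERNEL))
            resp[x] += abs(acc)
    return resp

def block_phase(resp, block=8):
    """Fold the response modulo the block width and return the strongest phase."""
    folded = [0.0] * block
    for x, r in enumerate(resp):
        folded[x % block] += r
    return max(range(block), key=folded.__getitem__)
```

Applied to a frame whose vertical block boundaries fall at columns 3, 11, 19 and so on, `block_phase` would be expected to return 3. The same procedure applied to the transposed image gives the horizontal boundaries.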

FIG. 10 shows the detectable horizontal boundaries (H), the estimated location for the undetectable boundaries (Hest), the boundaries derived for subsequent processing (Hder) and the pixel clock CLK, which is output from the processor 630 and corresponds to the pixels within each frame of input video. This clock is derived by digital synthesis within the digital processor core 630, although other methods are available. A small degree of variance is acceptable, provided that the clock does not stray close to pixel boundaries, where the setup and hold timing of the encoder video digitiser may become compromised.

The three derived signals, namely horizontal boundary H, vertical boundary V and pixel clock timing CLK, are used by the processor to align the block boundaries of the new encoding process with those used in the previous stage. They are used as base timing signals from which all other timing signals of the BED 300 are derived. Therefore, as the input video's base timing changes (for example, due to wow and flutter of a video tape during playback, or changes over a longer period of time), the timing of the processing will alter to suit, tracking the input timing on a continuous basis.
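Continuous tracking of the input timing could, purely as an example, be modelled by a first-order loop that nudges the current block pitch estimate towards each new measurement and derives the pixel period from it. The class name `PitchTracker`, the loop gain of 0.25 and the figure of 8 pixels per block are assumptions chosen for this sketch.

```python
class PitchTracker:
    """First-order tracking loop for the block pitch, in digitiser samples.

    All parameter values here are illustrative assumptions.
    """

    def __init__(self, initial_pitch, gain=0.25, pixels_per_block=8):
        self.pitch = float(initial_pitch)
        self.gain = gain
        self.pixels_per_block = pixels_per_block

    def update(self, measured_pitch):
        """Move the estimate a fraction of the way towards the measurement.

        Returns the pixel period (digitiser samples per pixel) from which a
        pixel clock could be synthesised.
        """
        self.pitch += self.gain * (measured_pitch - self.pitch)
        return self.pitch / self.pixels_per_block
```

A low gain smooths measurement noise at the cost of a longer settling time after a large change in input timing; a high gain does the reverse.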

The image is prepared for encoding by modifying the pixel structure to align with the derived boundaries. This can be achieved in a number of ways, such as by applying a “Warp” function that re-samples the image using non-linear pixel mapping; or by modifying the read addressing when extracting data from the framestore to pass to the encoder. The skilled person will appreciate that the same result could be achieved by pre-processing during storage, by modifying the digitisation rate and/or write addressing parameters.
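One possible form of such a re-sampling step is sketched below for a single scan line. It assumes piecewise-linear mapping between successive derived boundaries and linear interpolation of sample values; the function name `warp_line` and these interpolation choices are illustrative assumptions rather than the warp function of the embodiment.

```python
def warp_line(samples, boundaries, block=8):
    """Re-sample one over-sampled scan line onto the encoder's pixel grid.

    samples    -- digitised values along the line
    boundaries -- derived block boundary positions, in sample units; assumed
                  sorted and lying within [0, len(samples) - 1]
    block      -- output pixels per block (assumed to be 8)
    Each interval between successive boundaries is mapped onto exactly
    `block` output pixels, so output block edges coincide with the boundaries.
    """
    out = []
    for left, right in zip(boundaries, boundaries[1:]):
        step = (right - left) / block
        for k in range(block):
            pos = left + k * step
            i = int(pos)
            frac = pos - i
            nxt = min(i + 1, len(samples) - 1)
            # Linear interpolation between the two nearest input samples.
            out.append(samples[i] * (1 - frac) + samples[nxt] * frac)
    return out
```

Because each interval is mapped separately, intervals of unequal length are stretched or compressed by different amounts, which makes the overall pixel mapping non-linear.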

Significant changes in input timing, for example those caused by interruption of the video signal, would introduce a small transition period for settling, during which the timing is unlikely to be accurate and precise overlay of block boundaries would not be achieved.

Encoding the video using the same block boundaries and pixel clock as were originally used in a prior encoding step ensures that the block boundaries are not encoded as image data. Instead, they are artefacts that are propagated but not exacerbated during successive encoding stages. As a result, the encoding of each block will involve predominantly the same frequency components as were used in prior encoding stages. This would not be possible if the location of the block boundary grid were only approximate, in which case block boundaries would be encoded as image data and it is unlikely that the same level of compression would be achievable. The size of the file corresponding to each image would then increase as the image propagates through the whole system or, where bandwidth is limited, the level of compression would have to increase steadily to fit the available bandwidth, so that the quality of the image deteriorates between source and target.

It may be noted that MPEG-4 standards allow the block size to vary within a single image, according to the properties of each region within the image. These variable block sizes sit on top of the original MPEG block structure in a form of “quad tree”. BED 300 in such an embodiment may be adapted to identify variable size blocks. Alternatively, BED 300 may just be arranged to identify the smallest block structure within the image and align the pixels to it by means of a clock. The encoder which follows BED 300 can then, if it is an MPEG-4 or similar encoder, impose a similar block structure, by virtue of its own analysis.

As a further embodiment, for motion video, it is possible to determine the Group Of Pictures (GOP) structure from the input signal, that is, whether each image being analysed was encoded as an I-Frame, B-Frame or P-Frame. Rather than operating stand-alone as in the embodiment of FIG. 6, in this embodiment the block-based encoder feeds parameters back to the Boundary Edge Detector BED 300 to supplement the analysis of each image.

The parameters used to differentiate between the different frames are as follows. I-Frames will generally be of better quality than P-Frames, which in turn will generally be better than B-Frames. I-Frames generally contain a higher quantity of high frequency content than P-Frames or B-Frames. I-Frames often occur at regular intervals within a GOP sequence; there will therefore be a detectable drop in the block noise at this frequency, and an increase in high frequency image content.

Digitised audio data (PCM) would be processed in very similar fashion. An audio signal would be digitised at the appropriate rate (either fixed, or modified in the same manner as for video processing, described above), and the stream stored in a single dimension array. Analysis would be performed on the stored data to derive block boundary artefacts, and the appropriately aligned data passed to the audio encoder for subsequent encoding.

The other frames can be detected by searching for motion-attributed artefacts that exist in B-Frames or P-Frames, but not in I-Frames. For example, image tearing may be prevalent, where discontinuity exists within moving objects.

The quantity of block noise in each frame is measured by the Boundary Edge Detector BED 300, the frequency content of each frame can be derived by analysing the DCT coefficients produced by the encoder's DCT 320, and motion attributes are derived by analysing the pattern of block noise in a region of interest, by analysing a portion of the image itself to search for disjointed objects, or by analysing the motion data within the encoder motion compensator MC 380 and/or motion estimator ME 390. These attributes are analysed by the improved encoder for each frame, and used to derive a pattern that relates to the GOP sequence.
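As a simplified illustration of deriving such a pattern, the sketch below uses only the per-frame block noise measurement and searches for the spacing and position of frames at which the block noise regularly drops, which are taken as candidate I-Frames. The function name `find_i_frame_period`, the exhaustive search and the `max_period` limit are assumptions for this sketch; a practical implementation would also weigh frequency content and motion attributes as described above.

```python
def find_i_frame_period(block_noise, max_period=30):
    """Estimate the I-Frame spacing from a per-frame block noise sequence.

    block_noise -- one block noise measurement per frame (lower = cleaner)
    Returns (period, phase): frames phase, phase + period, ... have the
    lowest mean block noise. The smallest such period is preferred, since
    multiples of the true period select equally clean frames.
    """
    best = None  # (mean noise, period, phase)
    for period in range(2, max_period + 1):
        for phase in range(period):
            picks = block_noise[phase::period]
            if len(picks) < 2:
                continue  # not enough evidence for this candidate
            score = sum(picks) / len(picks)
            # Strict improvement only, so earlier (smaller) periods win ties.
            if best is None or score < best[0] - 1e-9:
                best = (score, period, phase)
    if best is None:
        raise ValueError("sequence too short to estimate a period")
    return best[1], best[2]
```

With noisy real measurements the tie between a period and its multiples is broken by chance, so some additional preference for shorter periods would probably be needed.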

The derived GOP sequence is then used to set the GOP sequence for the encoding, or at least as a reference to influence the GOP sequence (for example, synchronise every 12th I-Frame, and allow the device that is controlling the encoder to select the rest of the GOP sequence).

The skilled reader will appreciate that numerous variations are possible within the principles of the methods and apparatus described above. Accordingly it will be understood that the embodiments illustrated herein are presented as examples to aid understanding, and are not intended to be limiting on the scope of the invention claimed.

1. A method of encoding of data received from a source (100, 105, 140), wherein the encoding is of a type which imposes a structure (200, 210, 220, 230) on the data, which structure is not defined in the data as received, the method comprising the steps of: analysing (300) the received data to detect artefacts contained within the data indicating that the data has been through a previous encoding and decoding process (105, 110, 140) of the same type; extracting by analysis of said artefacts information as to the structure imposed on the data by said previous encoding process; encoding the received data by reference to the extracted structure information.
2. The method as claimed in claim 1, wherein the received data represents an image (IV), such as an image received through an analogue transmission (120) or storage (160) process, the structure (200, 210, 220, 230) imposed by the encoding process including a spatial structure in which pixels of the image are processed in blocks, the encoding being performed so as to align block boundaries of the encoding process substantially with block boundary artefacts present in the received image data as a consequence of the previous encoding process.
3. The method as claimed in claims 1 or 2, wherein the encoding process is of a type which imposes a spatial structure in which the blocks of pixels are grouped into macroblocks, the encoding being performed so as to align macroblock boundaries of the encoding process substantially with macroblock boundary artefacts present in the received image data as a consequence of the previous encoding process.
4. The method as claimed in any preceding claim, wherein the received image data is a motion picture sequence of images and the structure information used for each successive image is derived entirely by analysis (300) of at least one of the previous and present images.
5. The method as claimed in any preceding claim, wherein the received image data is over-sampled when initially digitised (600) from an analogue signal.
6. The method as claimed in any preceding claim, wherein, where the received image data represents a motion picture sequence, the structure imposed by the encoding process is a temporal structure (GOP structure) in which different images of the sequence are processed differently, the encoding being performed so as to apply substantially the same GOP structure to the sequence as was applied in the previous encoding process.
7. The method as claimed in any of claims 1 to 6, wherein the encoding is performed so as to apply a different GOP structure to, but temporally associated with, that used in the previous encoding process.
8. The method as claimed in claims 6 or 7, wherein the analysis of artefacts distinguishes between intra- and inter-coded pictures.
9. The method as claimed in any of claims 6, 7 or 8, wherein the analysis of GOP structure is performed by analysing several images stored in full in a memory (610, 620).
10. The method as claimed in any of claims 6, 7 or 8, wherein the analysis is performed by preserving only parameters of past images and analysing the present image with respect to those parameters.
11. The method as claimed in any preceding claim, wherein the received data comprises audio data, the structure imposed by the encoding process including a temporal structure in which samples of an audio signal are processed in blocks, each representing a short time interval, the encoding being performed so as to maximise alignment of block boundaries of the encoding process substantially with block boundary artefacts present in the received audio data as a consequence of the previous encoding process.
12. The method as claimed in claim 11, wherein the existence and position of artefacts within audio data are detected on an on-going basis and the encoding step is adapted on an on-going basis to maximise alignment of the block boundaries over time.
13. The method as claimed in claims 11 or 12, wherein the analysis step includes a phase-locked loop (PLL) process which is attuned to detect and then lock on to block boundary artefacts in a continuous data stream.
14. The method as claimed in claim 13, wherein the encoding step includes a second phase-locked loop or similar process for maximising alignment of the block boundaries of the encoding process with the detected block boundary artefacts gradually over time, to avoid sudden discontinuities in the block structure imposed by the encoding step.
15. An apparatus for encoding data adapted to implement the method according to the invention as set forth above.
16. An apparatus as claimed in claim 15 comprising a digital video recorder or digital audio recorder.
17. A method of pre-processing data received from a source (100, 105, 140), for subsequent application to an encoding process which imposes a structure (200, 210, 220, 230) on the data, which structure is not defined in the data as received, the method comprising the steps of: analysing (300) the received data to detect artefacts contained within the data indicating that the data has been through a previous encoding process of the same type; extracting by analysis of said artefacts information as to the structure imposed on the data by said previous encoding process; processing (630) the received data by reference to the extracted structure information so as to maximise alignment between the structure imposed by the previous encoding process and a predetermined structure.
18. A computer program product comprising instructions for causing a programmable computer to implement the specific method steps and/or apparatus features of the invention in any of its aspects as set forth herein.